A Multi-Resolution CRNN-Based Approach for Semi-Supervised Sound Event Detection in DCASE 2020 Challenge

Abstract

Sound Event Detection is a task of rising relevance in recent years in the field of audio signal processing, due to the creation of specific datasets such as Google AudioSet or DESED (Domestic Environment Sound Event Detection) and the introduction of competitive evaluations like the DCASE Challenge (Detection and Classification of Acoustic Scenes and Events). The different categories of acoustic events can present diverse temporal and spectral characteristics. However, most approaches use a fixed time-frequency resolution to represent the audio segments. This work proposes a multi-resolution analysis for feature extraction in Sound Event Detection, hypothesizing that different resolutions can be more adequate for the detection of different sound event categories, and that combining the information provided by multiple resolutions could improve the performance of Sound Event Detection systems. Experiments are carried out over the DESED dataset in the context of the DCASE 2020 Challenge, concluding that the combination of up to 5 resolutions allows a neural network-based system to obtain better results than single-resolution models in terms of event-based F1-score in every event category and in terms of PSDS (Polyphonic Sound Detection Score). Furthermore, we analyze the impact of score thresholding on the computation of F1-score results, finding that the standard value of 0.5 is suboptimal and proposing an alternative strategy based on the use of a specific threshold for each event category, which obtains further improvements in performance.
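The per-category thresholding idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the scores, the threshold values, and the helper name `binarize` are all hypothetical, chosen only to show how a category whose scores rarely reach 0.5 can still produce detections when it is given its own threshold.

```python
from typing import Dict, List

# Hypothetical frame-level scores in [0, 1] for two event categories
# (illustrative values, not taken from the paper).
scores: Dict[str, List[float]] = {
    "Speech": [0.2, 0.6, 0.7, 0.4],
    "Dishes": [0.1, 0.35, 0.3, 0.05],
}

# Hypothetical per-category thresholds; in practice these would be
# tuned on held-out validation data.
class_thresholds: Dict[str, float] = {"Speech": 0.5, "Dishes": 0.25}


def binarize(
    scores: Dict[str, List[float]],
    thresholds: Dict[str, float],
    default: float = 0.5,
) -> Dict[str, List[int]]:
    """Turn frame-level scores into 0/1 decisions, one threshold per category.

    Categories missing from `thresholds` fall back to `default`.
    """
    return {
        label: [int(s >= thresholds.get(label, default)) for s in frames]
        for label, frames in scores.items()
    }


# Single threshold of 0.5 for every category.
fixed = binarize(scores, {})
# One threshold per category.
per_class = binarize(scores, class_thresholds)
```

In this toy example the "Dishes" scores never reach 0.5, so the fixed threshold yields no active frames for that category, while the lower category-specific threshold marks the two middle frames as active.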

Similar resources

Sound Event Detection for Real Life Audio DCASE Challenge

We explore logistic regression classifier (LogReg) and deep neural network (DNN) on the DCASE 2016 Challenge for task 3, i.e., sound event detection in real life audio. Our models use the Mel Frequency Cepstral Coefficients (MFCCs) and their deltas and accelerations as detection features. The error rate metric favors the simple logistic regression model with high activation threshold on both se...

Experiments on the DCASE Challenge 2016: Acoustic Scene Classification and Sound Event Detection in Real Life Recording

In this paper we present our work on Task 1 Acoustic Scene Classification and Task 3 Sound Event Detection in Real Life Recordings. Among our experiments we have low-level and high-level features, classifier optimization and other heuristics specific to each task. Our performance for both tasks improved the baseline from DCASE: for Task 1 we achieved an overall accuracy of 78.9% compared to the...

STED: Semi-Supervised Targeted Event Detection

Social microblogs such as Twitter and Weibo are experiencing an explosive growth with billions of global users sharing their daily observations and thoughts. Beyond public interests (e.g., sports, music), microblogs can provide highly detailed information for those interested in public health, homeland security, and financial analysis. However, the language used in Twitter is heavily informal, ...

SMACD: Semi-supervised Multi-Aspect Community Detection

Community detection in real-world graphs has been shown to benefit from using multi-aspect information, e.g., in the form of “means of communication” between nodes in the network. An orthogonal line of work, broadly construed as semi-supervised learning, approaches the problem by introducing a small percentage of node assignments to communities and propagates that knowledge throughout the graph...

Semi Supervised Approach Based Brain Tumor Detection with Noise Removal

Brain tumor detection and segmentation is the most important challenging and time consuming task in the medical field. In this paper, Magnetic Resonance Imaging (MRI) sample image is considered and it is very useful to detect the Tumor growth. It is mainly used by the radiologist for visualization process of an internal structure of the human body without any surgery. Generally, the Tumor is cl...

Journal

Journal title: IEEE Access

Year: 2021

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2021.3088949